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INTRODUCTION 


This  reporc  discusses  an  investigation  into  aspects  of  software  reliability 
that  may  be  characterized  by  measurable  properties  of  the  program  source  code 
(static  analysis).  Through  examination  of  a  large  data  base,  size,  structural, 
and  syntactic  characteristics  of  software  modules  have  been  related  to  the 
module  histories  of  change.  These  relationships  can  be  used  as  specification 
criteria  for  module  characteristics  when  contracting  for  software.  Addition¬ 
ally,  they  can  be  used  to  estimate  maintenance  requirements,  to  compare  alter¬ 
nate  implementations,  to  estimate  changes  to  the  software,  and  as  input  to 
failure  models  in  order  to  estimate  software  MTFB  (Mean  Time  Between  Failure). 
These  applications  are  practical  in  a  FASP  (Facility  for  Automated  Software 
Production)  environment  (reference  (a))  because  the  program  source  code  is 
stored  in  a  data  base  and  the  tools  to  automatically  measure  its  size,  struc¬ 
tural,  and  syntactic  characteristics  fit  naturally  into  the  FASP  concept. 


GENERAL  DISCUSSION 


GOALS 


The  goal  of  this  study  was  to  determine  a  relationship  between  measurable 
characteristics  of  the  program  source  code  and  the  number  of  changes  that  are 
made  to  the  program  modules.  This  relationship  would  then  be  evaluated  as  the 
basis  for  a  program  module  acceptance  criterion. 

BACKGROUND 

Correct  operation  of  military  software  is  critical.  The  rapidly  increas¬ 
ing  use  of  software  in  tactical  systems  and  in  major  communication  systems  has 
resulted  in  initiatives  by  the  DOD  (Department  of  Defense)  and  the  Navy  to 
improve  the  reliability  of  its  software  and  to  control  the  cost  of  achieving 
that  reliability  (reference  (b)).  These  initiatives  include  evaluation  of 
software  quality  at  the  earliest  possible  moment  in  the  software  development 
cycle. 

The  cost  of  military  software  is  high.  DOD  software  costs  are  presently 
in  the  three  billion  dollars  per  year  range;  new  DOD  software  functions  are 
occurring  at  a  rate  of  67  percent  per  year  (reference  (c)).  About  75  percent 
of  the  cost  occurs  during  the  operation  and  maintenance  phase  of  the  system 
life  cycle  (references  (c),  (d),  and  (e)).  About  50  percent  of  the  remainder 
goes  into  the  various  developmental  activities  categorized  as  testing  (refer¬ 
ences  (d),  (f),  (g),  and  (h)). 

The  use  of  testing  to  detect  and  repair  software  errors  is  time  consuming 
and  expensive.  In  1972,  the  United  States  Air  Force  spent  over  750  million 
dollars  on  various  software  testing  activities  (reference  (d)).  Additionally, 
the  relative  cost  of  repair  for  software  errors  increases  exponentially  with 
time  into  the  software  life  cycle  (reference  (i)).  Therefore,  large  cost 
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leverage  can  be  expected  from  any  software  technique  that  reduces  the  testing 
or  maintenance  effort. 

By  using  measurable  characteristics  of  the  program  source  code  as  a  basis 
for  acceptance  of  the  delivered  program  modules,  poorly  implemented  software 
can  be  rejected  prior  to  the  testing  or  maintenance  phases.  Because  these 
characteristics  are  measurable,  the  acceptance  criterion  can  be  specified  as 
a  contractual  requirement.  This  measurement  would  not  identify  errors,  would 
not  guarantee  the  absence  of  errors,  and  would  not  eliminate  the  need  for  testing. 
Rather,  it  would  provide  the  ability  to  reject  the  program  source  code  until  it 
meets  a  minimum  standard  for  reliability. 


SUMMARY  OF  APPROACH 

The  overall  approach  of  this  study  has  been  to: 

1.  Identify  measurable  characteristics  of  the  source  program  that  are 
related  to  software  reliability. 

2.  Measure  these  characteristics  for  existing  military  software  and  collect 
data  for  the  corresponding  history  of  change  to  the  software. 

3.  Determine  a  relationship  between  the  measurable  characteristics  and 
the  changes. 

Identification  of  Measurable  Characteristics.  Measurements  may  be  performed 
upon  the  program  source  language  in  terms  of  its  size,  its  structure,  and  its 
syntax.  Size  properties  categorize  the  amount  of  the  program  through  counts 
of  fundamental  elements.  Examples  are  the  number  of  source  statements  and 
number  of  modules.  Structural  properties  categorize  program  organization  and 
flow  of  control.  Examples  are  program  hierarchy,  path  Information,  and  nesting 
level.  Syntactic  properties  categorize  the  source  language  and  its  use  in  the 
program.  Examples  are  statement  types,  use  of  terminal  symbols,  and  use  of 
operands.  McCabe  has  proposed  a  new  structural  measure  (reference  (j)). 

Halstead  has  proposed  measures  that  can  be  classified  as  either  size  or 
syntax  (reference  (k) ) . 

The  identification  portion  of  the  study  began  in  1974  with  an  attempt  to 
determine  structural  properties  of  modules  that  could  be  used  as  measures  of 
complexity.  The  properties  examined  were  the  number  of  nodes,  arcs,  paths, 
and  source  statements  present  in  each  module.  The  study  consisted  of  simu¬ 
lating  the  process  of  error  detection  in  a  module  as  a  function  of  exercising 
paths  through  the  module.  Control  structures  and  error  histories  for  44  pro¬ 
cedures  from  the  NTDS  (Navy  Tactical  Data  System)  program  were  used  as  input 
to  the  simulation.  The  results  are  reported  in  references  (1)  and  (m)). 

The  general  conclusion  of  this  initial  study  was  that  the  four  structural 
factors  had  potential  use  as  measures  of  complexity,  but  that  no  single  factor 
stood  out  as  being  a  major  determinant.  It  was,  therefore,  decided  to  include 
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syntax  in  the  measurable  properties  and  to  determine  if  the  program  complexity 
could  be  measured  as  a  function  of  two  or  more  parameters. 

Data  Collection.  There  are  a  number  of  problems  in  securing  size,  structure 
syntax,  and  reliability  data  for  large  military  software  systems.  One  problem 
is  that  these  software  systems  require  years  to  develop  and  are  expected  to 
have  a  long  operational  life.  Usage  of  the  software  is  required  to  produce 
reliability  data,  and  these  systems  rarely  exist  as  numerically  significant 
multiple  copies  operating  simultaneously.  It  is,  therefore,  necessary  to 
accept  the  long  term  cost  associated  with  establishing  and  maintaining  a  data 
collection  mechanism  over  both  the  development  period  and  some  portion  of  the 
maintenance  period. 

A  second  problem  is  technological.  Military  software  is  presently  devel¬ 
oped  on  equipment  configurations  and  under  operating  systems  that  lag  commercial 
practice  by  two  to  four  generations.  It  is  only  very  recently  that  environ¬ 
ments  such  as  FASP  have  been  created.  These  environments  allow  application  of 
automatic  tools  to  measure  the  software  characteristics,  maintain  the  results, 
and  control  the  software  development/maintenance  process.  Over  the  DOD,  this 
capability  is  limited.  Therefore,  on  present  projects,  data  collection  efforts 
will  be  manual,  clerical  activities.  This  has  cost  and  schedule  impact  which 
project  managers  must  accept. 

There  are  constraints  on  public  release  of  the  data  that  effectively  act 
to  limit  its  availability  for  studies.  In  those  cases  In  which  the  collection 
of  data  represents  an  expenditure  of  the  contractor's  resources,  the  information 
may  be  proprietary.  For  certain  military  systems,  the  data  may  be  classified 
since  it  reflects  on  the  reliability  of  those  systems. 

The  homogeneity  of  the  collected  data  is  affected  by  the  change  in  the 
technological  base  during  the  Interval  of  data  collection.  Over  the  software 
life  cycle  or  even  over  the  development  period,  military  software  efforts 
often  have: 

1.  The  target  hardware  change  from  a  prototype  system  with  one  set  of 
capabilities  to  a  production  system  with  a  different  set  of  capabilities. 

2.  The  Implementation  language  pass  through  a  number  of  revisions  or 
even  be  completely  respecified. 

3.  The  operating  system  environment  for  both  development  and  integration 
pass  through  radical  changes. 

4.  The  applied  software  engineering  methodologies  change  due  to  either 
selection  of  new  contractors  or  changes  in  approach  within  a  specific  con¬ 
tractor. 

These  effects  must  be  considered  both  in  terms  of  the  homogeneity  of  the 
data  being  analyzed  and  the  applicability  of  the  results  to  a  specific  software 
environment . 
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Another  factor  affecting  data  homogeneity  is  the  protean  nature  of  the 
source  program.  Over  the  life  cycle  of  the  software,  the  program  constantly 
changes  due  to  the  repair  of  failures  and  to  program  improvements.  Measured 
values  of  size,  structure,  and  syntax  will  vary  while  the  failure  data  is 
collected.  These  changes  are  not  necessarily  minor  and  present  a  difficulty 
in  interpretation  of  the  result. 

The  data  collection  portion  of  the  study  began  in  1973  as  part  of  an 
experiment  in  the  application  of  structured  programming  to  military  software. 

This  experiment  consisted  of  applying  various  software  engineering  methodologies 
(e.g.,  egoless  programming  teams,  top  down  Implementation,  structured  code)  to 
the  development  of  a  tactical  display  processing  program  for  the  CV-TSC  (Carrier 
Based  Tactical  Support  Center)  system  software.  The  effort  was  typical  of 
military  software  developments  in  that  the  assembly  language,  the  operating 
system,  the  computer,  and  the  display  system  were  all  prototypes.  The  final 
software  comprised  three  programs  totaling  578  modules  consisting  of  70,434 
words  of  instructions  resulting  from  126,755  lines  of  source.  The  results  of 
this  experiment  have  been  reported  in  reference  (n)). 

As  part  of  the  development  methodology,  detailed  documentation  at  the 
module  level  was  required  for  all  corrections  or  modifications.  This  documen¬ 
tation  began  with  approval  of  module  design  and  has  continued  through  implementa¬ 
tion,  testing  and  operation.  The  combination  of  availability  of  the  source 
program,  detailed  module  information  throughout  the  life  cycle,  and  close 
association  with  the  work  performed  resulted  in  the  selection  of  this  software 
system  as  the  subject  of  the  study.  The  raw  data  was  collected  and  analyzed 
in  1977. 

Additional  data  collection  efforts  have  been  started.  In  1975,  a  tool 
was  developed  to  measure  the  size,  structural,  and  syntactic  characteristics 
of  programs  written  in  the  Navy’s  CMS-2  language.  The  present  version  of  this 
tool,  SPAR  (Source  Program  Analyzer  and  Reporter),  has  been  used  to  analyze 
approximately  800,000  lines  of  source  code  from  Navy  tactical  programs.  It  is 
intended  to  use  this  information  in  concert  with  failure  histories  of  those 
programs  to  evaluate  the  initial  conclusions  of  this  study. 


Analysis .  The  module  source  language  has  been  measured  in  terms  of  software 
complexity  characteristics.  The  change  and  modification  histories  have  been 
analyzed  in  an  attempt  to  categorize  the  nature  of  the  change  as  "errors"  or 
"modifications."  Descriptive  statistics  for  the  various  quantities  have  been 
calculated.  Linear  regression  techniques  have  been  used  to  determine  relation¬ 
ships  between  the  measured  source  properties  and  the  various  values  for  change 
in  the  software.  The  relationships  have  been  evaluated. 


DATA 

Two  steps  were  required  to  determine  the  specific  values  of  the  module 
source  parameters  and  the  module  change  history  that  are  used  in  this  study. 
The  first  involved  collection  of  the  program  source  images  for  the  modules 
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and  collection  of  the  documents  that  describe  the  changes  to  those  modules 
during  development  and  operation.  The  second  consisted  of  analyzing  the 
source  images  to  determine  the  parameter  values  and  analyzing  the  change  docu¬ 
ments  to  determine  the  category  and  quantity  of  change  to  the  software  modules. 

COLLECTED  DATA 

The  source  program  images  and  the  corresponding  change  documentation  were 
collected  for  that  portion  of  the  CV-TSC  system  software  consisting  of  the 
tactical  display  processing  program.  Development  of  this  software  began  in 
June  1973.  Integration  of  the  tactical  display  processing  program  had  been 
completed  in  the  system  integration  facility  by  September  1974.  Initial 
delivery  as  part  of  the  CV-TSC  system  software  occurred  in  December  1974. 

The  tactical  display  processing  program  comprises  340  modules  consisting 
of  35,018  words  of  instructions  resulting  from  75,692  lines  of  source.  The 
source  images  were  collected  in  April  1977  from  the  CV-TSC  system  library 
identified  as:  "Set  J,  15  June  1977."  These  images  were  processed  by  various 
translators  and  system  routines  and  now  exist  as  files  in  the  NAV AIRDEV CEN  CCS 
(Central  Computer  System) . 

A  requirement  of  the  development  methodology  was  that  detailed  documenta¬ 
tion  at  the  module  level  be  provided  for  all  corrections  or  modifications  to 
approved  design  or  code.  These  documents  are  called  "change  requests."  They 
are  hand-written  and  provide  various  amounts  of  information  concerning  the 
change.  Appendix  A  is  an  example  of  a  typical  change  request. 

Approximately  800  change  requests  were  available  in  April  1977.  These 
spanned  3*s  years  from  October  1973  to  the  end  of  March  1977.  The  last  2*s 
years  represented  the  period  of  operational  usage  and  program  maintenance.  Only 
a  small  number  of  these  800  change  requests  fail  to  Involve  a  change  in  the 
program  source.  The  major  problem  in  evaluating  them  lies  in  distinguishing 
between  changes  to  correct  an  error  and  modifications  to  improve  the  program. 


ANALYZED  DATA 

This  data  resulted  from  examination  of  the  change  requests  and  analysis  of 
the  source  images  of  the  tactical  display  processing  program.  The  effort  re¬ 
quired  to  examine  the  change  requests  constrained  the  data  to  a  portion  of  the 
program.  This  portion  consisted  of  all  tactical  display  processing  software 
with  the  exception  of  those  modules  used  to  define  the  global  data  base  and 
those  modules  that  comprised  the  special  purpose  executive. 

All  of  the  source  image  data  collected  in  April  1977  were  processed  by 
an  analyzer  that  counted  various  size  parameters  and  provided  detailed  data  on 
the  use  of  the  syntax.  Review  of  the  listings  for  those  modules  corresponding 
to  the  examined  change  requests  provided  additional  counts  of  structural  para¬ 
meters. 
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The  resulting  data  were  condensed  Into  the  final  set  of  size,  structural, 
and  syntactic  parameters  and  stored  as  a  data  base  in  the  NAVAIRDEVCEN  CCS. 

The  stored  data  consist  of  parameter  values  and  change  values  for  272  modules 
and  61  associated  submodules  comprising  16,915  words  of  instructions  resulting 
from  53,614  lines  of  source.  There  are  a  total  of  462  changes. 


Modules  and  Submodules.  The  program  development  methodology  specified  the 
module  to  be  the  basic  source  unit  building  block  for  the  program.  A  module 
was  defined  as  a  program  unit  that  was  discrete  and  identifiable  with  respect 
to  assembly,  combining  with  other  units,  and  loading.  Due  to  the  development 
methodology,  modules  usually  performed  a  single,  specific  function  (i.e.,  had 
functional  strength) . 

There  were  cases  in  which  a  parent  module  called  various  subordinate 
modules  to  perform  related  functions  on  a  conmon  data  structure.  In  order  to 
hide  the  data  structure,  the  subordinate  modules  were  physically  located  within 
the  source  Image  of  the  parent  module,  (i.e.,  the  modules  had  informational 
strength).  In  these  cases,  the  subordinate  modules  were  termed  "submodules." 
Although  only  modules  existed  as  identifiable  entities  to  the  operating  system, 
the  submodules  were  treated  as  distinct  and  identifiable  units  with  regard  to 
the  methodology  of  the  development  cycle. 

For  the  purposes  of  this  study,  modules  and  submodules  are  treated  identi¬ 
cally.  In  the  rest  of  this  report,  the  term  "module"  will  mean  "module  or 
submodule"  unless  specifically  noted  otherwise. 

Change  Request  Data.  Each  change  request  was  examined  to  determine  which  modules 
were  affected.  For  each  module,  six  items  of  information  were  recorded: 

1.  Change  request  identifying  number. 

2.  Date  of  change  request. 

3 .  Implementing  engineer . 

4 .  Approving  engineer . 

5.  Primary  area  of  change  (design,  code,  unknown). 

6.  Primary  reason  for  change  (error,  modification,  unknown). 

A  sample  data  sheet  can  be  found  in  appendix  B. 

If  the  change  was  known  to  be  a  change  in  code  but  not  a  change  in  design, 
the  area  of  change  was  designated  as  "code."  If  the  change  was  known  to  be  a 

design  change  as  well  as  a  change  in  code,  it  was  designated  as  "design." 

Otherwise,  the  change  area  was  designated  as  "unknown."  This  latter  case 
occurred  in  three  Instances. 

If  the  reason  for  the  change  was  a  known  error  as  indicated  on  the  change 

request  or  recalled  by  one  of  the  engineers,  it  was  designated  as  an  "error." 
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If  the  change  was  a  known  modification  (in  the  sense  of  an  "improvement"  to 
the  original  program)  as  indicated  on  the  change  request  or  recalled  by  one 
of  the  engineers  it  was  designated  as  a  "modification."  Otherwise,  the  reason 
for  change  was  designated  as  "unknown." 

From  this  Information,  the  following  values  were  calculated  and  stored 
by  module  in  a  database  in  the  CCS: 

1.  Total  number  of  changes  (Total  Changes). 

2.  Total  number  of  changes  in  design. 

3.  Total  number  of  changes  in  code  only. 

4.  Total  number  of  unknown  change  areas. 

5.  Total  number  of  known  errors  (Known  Errors). 

6.  Total  number  of  known  modifications. 

7.  Total  number  of  unknown  reasons  for  change. 

8.  Possible  Errors  (Defined  as:  Total  Changes  -  Known  Modifications). 

For  the  purposes  of  the  study.  Total  Changes,  Known  Errors,  and  Possible 
Errors  were  selected  as  the  measures  of  change  to  the  program  modules. 

Source  Code  Data.  The  source  code  data  resulted  from  the  output  of  a  program 
that  counted  size  and  syntactical  parameters  and  from  examination  of  the  module 
listings  for  counts  of  structural  parameters.  Thirty  parameters  were  collected 
for  each  module: 

1.  Number  of  cards. 

2.  Number  of  comment  cards. 

3.  Number  of  processed  cards. 

4.  Registers:  Number  of  unique  references. 

5.  Registers:  Total  number  of  references. 

6.  Op-Codes:  Non-branching:  Number  of  unique  references. 

7.  Op-Codes:  Non-branching:  Total  number  of  references. 

8.  Op-Codes:  Branching:  Number  of  unique  references. 

9.  Op-Codes:  Branching:  Total  number  of  references. 
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10.  Op-Codes:  Directives:  Number  of  unique  references. 

11.  Op-Codes:  Directives:  Total  number  of  references. 

12.  Op-Codes:  Total  number  of  unique  references. 

13.  Op-Codes:  Total  number  of  references. 

14.  Variables:  Local  names:  Number  of  unique  references. 

15.  Variables:  Local  names:  Total  number  of  references. 

16.  Variables:  Global  names:  Number  of  unique  references. 

17.  Variables:  Global  names:  Total  number  of  references. 

18.  Variables:  Numbers:  Number  of  unique  references. 

19.  Variables:  Numbers:  Total  number  of  references. 

20.  Variables:  Local  expressions:  Number  of  unique  references. 

21.  Variables:  Local  expressions:  Total  number  of  references. 

22.  Variables:  Global  expressions.  Number  of  unique  references. 

23.  Variables:  Global  expressions:  Total  number  of  references. 

24.  Variables:  Total  number  of  unique  references. 

25.  Variables:  Total  number  of  references. 

26.  Total  number  of  ITE's  (IF-THEN-ELSE's) . 

27.  Total  number  of  DOW’s  (DOWHILE's). 

28.  Total  number  of  DOU's  (DOUNTIL's). 

29 .  McCabe  Number . 

30.  Program  Clarity. 

Various  parameters  are  described  below.  Actual  values  for  each  module  can  be 
found  in  appendix  C. 

Card  Counts  (Items  1,  2,  3) .  The  number  of  cards  (item  1)  includes  all  proces¬ 
sed  cards,  all  comment  cards,  all  blank  cards,  and  three  cards  related  to  job 
control  (header,  assembler  on,  assembler  off).  A  processed  card  (item  3)  is 
a  source  image  containing  an  op-code  or  a  directive.  It  is  identical  to  the 
total  number  of  op-code  references  (item  13). 
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Op-Code  Counts  (Items  6-13) .  Directives  (items  10,  11)  are  assembler  pseudo¬ 
ops.  The  total  references  (items  12,  13)  are  the  sums  of  nonbranching  op¬ 
codes  (items  6,  7),  branching  op-codes  (items  8,  9),  and  directives.  The 
total  number  of  op-code  references  (item  13)  is  identical  to  the  number  of 
processed  cards. 

Variable  Counts  (Items  14-25).  Local  names  are  variables  defined  internally 
to  the  module;  global  names  are  defined  externally.  Global  names  include  the 
names  of  other  modules  when  referenced. 

Numbers  have  been  counted  as  unique  parameters  when  they  were  used  as 
literals  or  as  the  operand  in  an  equivalence  statement.  They  were  not  counted 
when  part  of  an  expression  defining  addresses  or  constants. 

Expressions  in  the  operand  field  of  the  assembly  statement  result  in 
values  that  represent  single,  unique  items,  whether  they  be  addresses  or  constants. 
Therefore,  expressions  have  been  counted  as  single  items  and  treated  in  the  analy¬ 
sis  as  if  they  were  an  alternate  form  of  spelling  for  a  variable. 

The  total  number  of  references  (items  24,  25)  are  the  sums  of  local 
variables,  global  variables,  numbers,  local  expressions,  and  global  expressions. 


Structured  Programming  Constructs  (Items  26,  27,  28).  These  are  counts  of  the 
occurrences  in  the  modules  of  the  standard  structured  programming  constructs. 

McCabe's  Number  (Item  29).  This  is  the  cyclomatic  complexity,  V(G),  proposed 
by  McCabe  in  reference  (j) .  It  is  a  structural  measure  that  may  be  viewed  as 
the  number  of  Independent  paths  in  the  module.  For  a  module  with  a  directed 
graph,  G,  containing  n  vertices,  e  edges,  and  p  connected  components,  it  is 
defined  as: 

V(G)  -  e  -  n  +  p 

McCabe  shows  that  for  structured  programs  the  cyclomatic  number  is  equal 
to  one  more  than  the  number  of  predicates  (i.e.,  decision  points);  therefore, 
this  item  was  equal  to  the  sum: 

ITE  (item  26)  +  DOW  (item  27)  +  DOU  (item  28)  +  1 

Program  Clarity  (Item  30).  Halstead,  in  reference  (k) ,  suggests  a  number  of 
measures  that  could  be  viewed  as  size  or  syntactic  parameters.  His  suggested 
parameter,  E,  defined  as  "the  total  number  of  elementary  mental  discriminations 
required  to  generate  a  given  program"  was  chosen  as  a  possible  measure  of  module 
clarity.  It  was  calculated  from  the  approximation: 

E  _  [(Nj  +  N2)  log2  (ni  +  P2>3  p1n2 

2n2 
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where:  N^  ■  Total  Operator  References. 

■  Total  number  of  op-codes  referenced  (item  13). 

N£  *  Total  Operand  References. 

■  Total  number  of  register  references  (item  5)  +  total  number  of 
variable  references  (item  25) . 

n^  *  Number  of  Unique  Operators. 

*  Total  number  of  unique  op-code  references  (item  12) . 

n2  ■  Number  of  Unique  Operands. 

■  Total  number  of  unique  register  references  (Item  4)  +  total 
number  of  unique  variable  references  (item  24) . 


ANALYSIS 


Analysis  of  the  data  consisted  of  calculating  descriptive  statistics  for 
the  data,  calculating  correlation  coefficients  between  data  items,  and  deter¬ 
mining  linear  and  nonlinear  relationships  between  the  data  parameters  (measures 
of  size,  syntax,  and  structure)  and  measures  of  changes  to  the  software.  The 
primary  goal  of  the  analysis  was  to  determine  a  functional  relationship  that 
could  be  used  to  accept  or  reject  source  modules  on  the  basis  of  predicted 
change  to  the  module. 

Analysis  was  performed  on  the  NAVAIRDEVCEN’s  CCS.  This  is  a  large-scale, 
third  generation  commercial  system  consisting  of  two  CDC  (Control  Data  Corpora¬ 
tion)  6600's  and  a  CYBER  170/Model  175.  Calculations  were  performed  using  the 
statistical  library  supplied  as  part  of  the  operating  system  (reference  (o)). 

DESCRIPTIVE  STATISTICS 

There  are  38  data  items.  Thirty  are  parameter  values  measured  from  the 
source  image  of  the  modules,  and  eight  are  counts  of  change  to  the  modules  as 
evaluated  from  the  change  requests.  From  the  eight  counts  of  change,  TC  (Total 
Changes),  BCE  (Known  Errors),  and  FE  (Possible  Errors)  were  selected  as  the 
measures  of  change. 

The  sum,  the  mean,  the  variance,  and  the  standard  deviation  over  the  333 
data  samples  were  calculated  for  each  of  the  38  data  items.  For  each  measure 
of  change  (TC,  BCE,  and  PE)  the  data  items  were  grouped  by  value  of  that  measure. 
Then  the  sum,  the  mean,  the  variance,  and  the  standard  deviation  were  calculated 
for  each  data  item  in  each  group. 

The  calculations  were  performed  to  provide  a  general  understanding  of  the 
data,  to  aid  in  selection  of  significant  parameters  for  the  linear  and  nonlinear 
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fits,  and  as  a  check  on  steps  In  the  linear  regression  calculations.  The 
results  are  listed  in  appendix  D. 

CORRELATION  COEFFICIENTS 

Correlation  coefficients  were  calculated  between  each  pair  of  the  38  data 
items.  The  calculations  were  performed  to  aid  in  identifying  significant  para¬ 
meters  and  to  aid  in  reducing  the  parameter  sets  for  the  multiparameter  linear 
fits  and  the  nonlinear  fits.  The  correlation  coefficients  between  the  38  data 
items  and  the  measures  of  change  (TC,  KE,  and  PE)  are  presented  in  appendix  E. 

LINEAR  RELATIONSHIPS 

Two  groups  of  linear  fits  were  calculated.  The  first  group  involved 
selecting  pairs  of  significant  parameters  from  the  set  of  30  parameters  and 
using  multiple  regression  analysis  to  fit  the  pairs  to  each  of  the  three 
measures  of  change  (TC,  KE,  and  PE).  Two  parameters  were  chosen  to  represent 
size  or  syntax  and  two  parameters  were  chosen  to  represent  structure.  The 
four  resulting  size/structure  pairs  were: 

1.  Processed  Source  (PS)) /McCabe  Number  (MN) . 

2.  Processed  Source  (PS) /Total  Branching  Instructions  (TBI) . 

3.  Program  Clarity  (PC) /McCabe  Number  (MN) . 

4.  Program  Clarity  (PC) /Total  Branching  Instructions  (TBI). 

The  second  group  of  linear  fits  involved  using  multiple  regression  analysis 
to  fit  about  two-thirds  of  the  data  parameters  to  each  of  the  measures  of  change. 
Elimination  of  parameters  from  the  set  to  be  fitted  was  subjective.  Guidelines 
generally  followed  were  to  eliminate: 

1.  Parameters  that  essentially  duplicated  each  other. 

2.  Parameters  that  represented  a  significant  component  of  another  parameter. 

3.  Parameters  that  were  highly  correlated  with  a  parameter  already  included. 

Two  Parameter  Linear  Fit.  The  goal  was  to  find  a  fit  of  the  form: 

ci  •  Klsi  +  K2Gi  +  k3 
where:  C  is  a  measure  of  change 

S  is  a  size/syntax  parameter 
G  is  a  structure  parameter 
K's  are  constants. 
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The  fit  was  then  used  to  predict  values  of  change,  and  the  predictions  were 
compared  with  the  actual  values  in  order  to  evaluate  the  linear  function  as  an 
acceptance/rejection  criterion  for  the  modules. 

Appendix  F  contains  the  12  sets  of  results.  Each  set  includes  the  defining 
function,  a  quantized  scatter  diagram  of  predicted  values  versus  actual  values, 
and  an  evaluation  of  the  prediction.  The  evaluation  indicates  the  percent  of 
correct  acceptance/rejection  and  the  percent  of  incorrect  acceptance/rejection 
for  various  levels  of  rejection. 


Multiparameter  Linear  Fit.  The  data  parameters  were  fitted  to  the  three  measures 
of  change  to  achieve  functions  of  the  form: 


Ci  *  ^  KjPiJ 

where:  The  m  values  of  P^  represent  the  set  of  selected  data  parameters  as 
described  above. 

P^  is  one  of  the  size/syntax  parameters  (PS  or  PC) . 

P2  is  one  of  the  structure  parameters  (MN  or  TBI) . 

The  K's  are  constants. 

As  the  number  of  independent  variables  in  the  regression  formula  is  increased, 
the  residual  error  for  the  predicted  change  value  of  the  specific  data  set  de¬ 
creases.  Thus,  the  resulting  multiparameter  function  can  be  used  to: 

1.  Indicate  an  upper  bound  to  the  quality  that  can  be  expected  for  a 
linear  fit. 

2.  Indicate  how  well  the  parameters  S  and  G  (two  parameter  linear  fit) 
represent  the  characteristics  of  the  larger  set  of  parameters. 

Appendix  G  contains  the  12  sets  of  results.  Each  set  includes  the  defining 
function,  a  quantized  scatter  diagram  of  predicted  values  versus  actual  values, 
and  an  evaluation  of  the  prediction.  The  evaluation  indicates  the  percent  of 
correct  acceptance/rejection  and  the  percent  of  Incorrect  acceptance/rejection 
for  various  levels  of  rejection. 

NONLINEAR  RELATIONSHIPS 

The  goal  was  to  determine  if  a  nonlinear  function  would  Improve  the  pre¬ 
diction  of  the  measures  of  change.  The  functional  form  selected  was: 

C*  -  S®  G?  t  Kj  Pij 
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where:  C  is  che  measure  of  change. 

S  and  G  are  the  significant  size  or  structure  parameters  as 
described  above. 

The  Pj.'s  are  the  remaining  data  parameters  selected  as  described 
above  and  include  the  constant,  "1." 

The  K's,  a,  and  $  are  constants. 

The  values  for  a  and  6  were  determined  through  multiple  regression  analysis 
on  the  data: 

log10  (Ci  +  “  alog10Si  +  01oglOGi  +  Y 

Appendix  H  contains  the  12  sets  of  results.  Each  set  includes  the  defining 
function,  a  quantized  scatter  diagram  of  predicted  values  versus  actual  values, 
and  an  evaluation  of  the  prediction.  The  evaluation  indicates  the  percent  of 
correct  acceptance/rejection  and  the  percent  of  incorrect  acceptance/rejection 
for  various  levels  of  rejection. 


RESULTS 

Results  are  discussed  for  the  areas  of  descriptive  statistics  of  the  data, 
correlation  coefficients,  and  functional  relationships  resulting  from  the  multiple 
regression  analysis.  It  should  be  emphasized  that  these  results  are  based  upon 
data  representing  a  unique  software  development  environment  (e.g.,  language, 
support  facilities,  methodologies,  personnel).  Their  general  applicability 
must  be  further  considered  in  terms  of  the  similarity  of  other  software  develop¬ 
ment  environments  to  the  environment  that  has  been  studied. 

DESCRIPTIVE  STATISTICS 

Appendix  D  contains  four  sets  of  descriptive  statistics  of  the  data.  The 
first  set  consists  of  the  sum,  mean,  variance,  and  standard  deviation  of  the 
data  items  over  all  samples.  In  the  second  set,  the  data  have  been  grouped 
by  value  of  the  Total  Change  data  item.  The  sum,  mean,  variance  and  standard 
deviation  of  each  data  item  in  each  group  has  been  calculated.  The  third  and 
fourth  sets  have  been  processed  similarly  in  terms  of  values  of  the  Known  Errors 
and  Possible  Errors  data  items. 

All  Samples.  Table  I  presents  descriptive  statistics  for  an  average  module. 

These  are  rounded  values  taken  from  the  complete  set  in  appendix  D.  Table  I 
shows  that  the  development  methodology  resulted  in  small,  uncomplicated 
modules  that  required  little  change.  The  main  points  of  interest  are: 

1.  An  average  module  is  small  and  heavily  commented. 

2.  Decision  branching  is  largely  due  to  IF-THEN-ELSES  as  opposed  to 
DOWHILES  or  DOUNTILS. 
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3.  Branching  In  Che  module  Is  largely  due  Co  Che  two  branches  per 
scrucCured  consCrucC  plus  Che  single  reCurn  Co  Che  calling  module. 

4.  The  mean  value  of  McCabe's  Number  Is  low. 

5.  The  mean  value  of  Program  ClarlCy  Is  low. 

6.  The  mean  values  for  Che  measures  of  change  are  low. 


TABLE  I 

Descripclve  Scads Cics  for  an  Average  Module 


Daca  Item 

Mean 

Standard  Deviation 

Processed  Source  Cards 

51 

48 

Comment  Cards 

105 

97 

Tocal  Branching  Instructions 

11 

12 

McCabe  Number 

5.5 

5.8 

Program  Clarity 

42K 

84K 

IF-THEN-ELSES 

4.0 

5.4 

D0HH1LES 

0.1 

0.3 

DOONTILS 

0.4 

0.8 

Total  Changes 

1.4 

2.1 

Known  Errors 

0.7 

1.4 

Possible  Errors 

1.0 

1.7 

Grouped  Samples.  Tables  II,  III,  and  IV  presenC  Che  means  and  scandard 
devlaciona  for  selecCed  daca  icems  grouped  according  Co  Che  number  of 
occurrences  of  Tocal  Changes,  Known  Errors,  and  Possible  Errors  respecclvely. 
The  daca  lcema  presen ced  are  Chose  used  In  Che  Cwo  parameCer  linear  fics 
(Processed  Source,  ToCal  Branching  InsCrucClons,  McCabe  Number,  and  Program 
ClarlCy).  The  values  have  been  rounded  from  Che  resulCs  In  appendix  D. 

In  general,  Che  mean  values  of  Che  grouped  daca  Increase  nonlinear ly 
wlch  measure  of  change.  Specific  points  do  not  conform.  For  instance,  the 
mean  values  at  6  Known  Errors  (2  samples)  and  the  mean  values  ac  6  and  7 
Possible  Errors  (3  samples)  are  low  compared  with  adjacent  groups.  However, 
the  overall  results  indicate  that  larger,  more  complicated  modules  will  be 
changed  more  frequently.  At  Che  same  time,  tables  II,  III,  and  IV  indicate 
that  no  single  source  code  parameter  Is  better  than  the  others  and  chat  no 
single  measure  of  change  Is  superior  to  the  others. 


TABLE  II 


TABLE  III 


TABLE  IV 

Means  (M)  and  Standard  Deviations  (SD)  for  Selected  Data 
Function  of  Occurrences  of  Possible  Errors 
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Another  way  of  looking  at  these  results  is  that  they  partition  into 
three  regions  which  can  be  viewed  as  low,  medium,  and  high  measures  of  the 
particular  type  of  change  (Total  Changes,  Known  Errors,  Possible  Errors). 

Table  V  presents  these  regions  and  the  associated  mean  values  for  Processed 
Source,  Total  Branching  Instructions,  McCabe  Number,  and  Program  Clarity  of 
the  data  within  each  region. 

Table  V  indicates  that  for  this  software  development  environment  it  would 
have  been  possible  to  use  measurements  of  size/structure  parameters  as  a  basis 
for  accepting  modules  that  would  have  low  occurrences  of  change  and  rejecting 
modules  that  would  have  high  occurrences  of  change.  It  can  also  be  seen  from 
table  V  that  the  results  would  be  similar  regardless  of  which  measure  of  change 
is  selected. 


TABLE  V 

Mean  Values  of  Parameters  for  Regions  of  Change 


Low  Change  Medium  Change  High  Change 


Total  Changes 

Processed  Source 
Total  Branching  Instructions 
McCabe  Number 
Program  Clarity 

Known  Errors 

Processed  Source 
Total  Branching  Instructions 
McCabe  Number 
Program  Clarity 

Possible  Errors 

Processed  Source 
Total  Branching  Instructions 
McCabe  Number 
Program  Clarity 


(0,  1) 

(2-7) 

(8-15) 

39 

69 

198 

9 

15 

44 

4.4 

6.8 

21.7 

27K 

55K 

24  IK 

(0) 

(1-4) 

(5-11) 

40 

69 

167 

9 

15 

41 

4.4 

6.9 

18.1 

28K 

55K 

285K 

(0,  1) 

(2-7) 

(8-15) 

40 

77 

237 

9 

17 

57 

4.4 

7.7 

28.4 

28K 

67K 

444K 

CORRELATION  COEFFICIENTS 

Table  VI  presents  the  values  of  the  correlation  coefficients  between 
selected  data  items  and  the  three  measures  of  change  (Total  Changes,  Known 
Errors,  Possible  Errors).  The  table  Includes  Processed  Source,  Total  Branch¬ 
ing  Instructions,  McCabe  Number  and  Program  Clarity.  No  single  data  item 
correlates  well  with  any  of  the  measures  of  change. 
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TABLE  VI 

Correlation  Coefficients  Between  Selected  Data  Items 
and  Measures  of  Change 


Data  Item 

Total  Changes 

Known  Errors 

Possible  Errors 

Processed  Source  Cards 

0.53 

0.51 

0.55 

Comments 

0.49 

0.49 

0.54 

Total  Branching  Instructions 

0.48 

0.49 

0.55 

McCabe  Number 

0.43 

0.46 

0.51 

Program  Clarity 

0.46 

0.51 

0.52 

IF-THEN-ELSES 

0.44 

0.47 

0.51 

DOWHILES 

0.15 

0.19 

0.20 

DOUNTILS 

0.15 

0.14 

0.17 

FUNCTIONAL  RELATIONSHIPS 

An  evaluation  of  the  various  functional  relationships  as  a  source  program 
module  acceptance  criterion  is  presented  in  table  VII.  This  data  has  been  ex¬ 
cerpted  from  the  results  in  appendices  F,  G,  and  H.  Data  values  have  been 
rounded . 

The  functional  relationships  resulting  from  the  linear  and  nonlinear  fits 
have  been  used  to  predict  the  amount  of  change  (Total  Changes,  Known  Errors, 
Possible  Errors)  for  each  module.  The  low/medium  change  boundary  indicated  by 
the  grouped  statistics  of  table  V  has  been  used  as  the  module  acceptance/ 
rejection  criterion.  The  results  as  predicted  from  the  functional  relation¬ 
ships  have  been  compared  with  the  actual  changes  indicated  by  the  data.  The 
evaluation  in  table  VII  is  expressed  as  the  percent  of  modules  correctly 
accepted/rejected  and  the  percent  of  modules  incorrectly  accepted/rejected. 

Table  VII  indicates  that  for  this  software  development  environment  any  of 
the  36  developed  equations  could  have  been  successfully  used  as  a  screening 
criterion.  Correct  acceptance  and  rejection  of  source  modules  would  have 
occurred  in  approximately  70  percent  of  the  cases.  The  remaining  cases  are 
approximately  split  between  incorrect  acceptance  and  incorrect  rejection  of 
the  source  modules. 

Table  VII  also  indicates  that  there  is  no  significant  improvement  in 
acceptance  and  rejection  if  a  larger  number  of  parameters  (multiparameter 
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TABLE  VII 

Evaluation  of  Acceptance  Criterion 
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linear  fit)  or  a  more  complicated  relationship  (nonlinear  fit)  are  used. 

This  is  significant  in  that  it  simplifies  the  practical  problem  of  measuring 
the  source  code  by  reducing  the  number  of  parameters  that  must  be  measured. 

The  coefficient  values  for  the  twelve  equations  resulting  from  the  two 
parameter  linear  fits  are  presented  in  table  VIII.  The  coefficient  values  for 
the  multiparameter  linear  fits  and  the  nonlinear  fits  can  be  found  in  appendices 
G  and  H. 


CONCLUSIONS 


The  goal  of  this  study  was  to  determine  a  functional  relationship  between 
various  parameters  of  the  module  source  code  and  various  measures  of  change  to 
the  module.  This  relationship  would  be  used  as  the  basis  of  an  acceptance 
criterion  for  delivered  source  code.  The  results  of  tables  II,  III,  IV,  V, 
and  VI  show  that  a  single  source  code  parameter  does  not  provide  sufficient 
information  for  an  acceptance  criterion.  The  results  of  table  VII  show  that 
a  linear  equation  in  a  size  parameter  and  a  structural  parameter  provides  a 
sufficient  basis  for  a  practical  acceptance  criterion. 

Table  VII  indicates  that  a  successful  prediction  is  made  by  any  of  the 
functional  relationships  in  about  70  percent  of  the  cases.  In  about  15  percent 
of  the  cases,  the  source  code  is  incorrectly  rejected.  Incorrect  rejection 
represents  a  prediction  of  more  changes  to  the  module  than  would  actually 
result.  The  Implementor  is  required  to  recreate  software  modules  that  actually 
would  have  had  an  acceptable  number  of  errors/changes  because  they  are  evaluated 
as  "too  complicated"  by  the  screening  criterion.  They  will  be  accepted  when 
they  are  simplified.  Over  the  life  cycle  of  the  software  system,  this  clari¬ 
fication  will  have  benefits  as  the  software  is  modified  to  adapt  to  changing 
requirements.  The  re-doing  of  the  software  is  being  performed  in  early  devel¬ 
opment  when  it  costs  the  least. 

In  about  15  percent  of  the  cases,  the  source  code  is  incorrectly  accepted. 
Incorrect  acceptance  represents  a  prediction  of  fewer  changes  to  the  module 
than  will  actually  occur.  These  errors/changes  to  the  module  will  have  to 
be  detected  during  testing  or  operation.  This  is  no  worse  than  the  situation 
presently  accepted  in  software  development. 

Incorrect  acceptance  is  the  least  desired  case  since  it  Involves  changes 
to  the  software  during  the  testing,  Integration,  or  maintenance  phases.  The 
lowest  values  of  incorrect  acceptance  occur  in  table  VII  when  the  measure  of 
change  is  Known  Errors.  For  these  cases,  incorrect  acceptance  of  modules  that 
are  to  have  less  than  one  known  error  will  occur  about  10  percent  of  the  time 
regardless  of  which  of  the  two  parameter  linear  equations  is  selected.  From 
table  VIII,  the  four  equations  are: 

(1)  KE  -  0.014  (PS)  +  0.00042  (MN)  -  0.072 

(2)  KE  -  0.011  (PS)  +  0.015  (TB)  -  0.074 
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TABLE  VIII 

Coefficient  Values  for  the  Two  Parameter  Linear  Fits 


NADC-7 9163-50 


CO 

•u 

c 

rH 

NO 

in 

CO 

NO 

r-. 

On 

CN 

4J 

o 

H 

NO 

cn 

(0 

CN 

CN 

r-* 

m 

c 

• 

• 

• 

• 

0 

o 

O 

o 

o 

o 

o 

S3 

X 

cn 

NO 

ON 

rH 

•a* 

cn 

00 

cn 

CN 

sa* 

cn 

*a* 

rv 

p^ 

CN 

rH 

• 

• 

• 

• 

o 

o 

O 

O 

7  7 


o 

o 

H 

*H 

X 

X 

CN 

On 

On 

rH 

cn 

rH 

Pv 

cn 

O 

NO 

T—i 

CN 

CN 

cn 

CN 

• 

• 

• 

• 

o 

o 

o 

O 

00  CQ 
C  H 
*H 
A 
U  CD 
C  £ 
<0  o 

U  *H 
eg  -u 
a 

rH  3 
<0  M 
4J  U 
O  03 
H  3 


O 

S3 

rH 

a* 

rH 

o 


o 

H 

X 


cn 

PN. 

m 

m 

o 


cu  5 

A 

NO 

1 

o 

S3 

oo 

i 

o 

rH 

X 

i 

o 

rH 

X 

i—i 

i 

o 

S3 

rH 

1 

o 

S3 

to  u 
O  0) 

S  | 

m 

o 

rH 

• 

ON 

>a- 

• 

00 

rH 

-a- 

>a* 

o 

sr 

CN 

cn 

rH 

ON 

cn 

CN 

o 

ss 

? 

o 

o 

o 

• 

o 

• 

o 

00  AJ 
O  -H 

0u  (Q 


cn  m 

'o  'o 

s  * 

no  00 

cn 

oo  on 
oo  ^ 
•  • 
o  o 


cn  cn 
i  I 
o  o 

rH  rH 

X  X 


on  r*. 

no  rH 

no  m 
•  • 
o  o 


cn  cn 
■o  'o 

X  >3 

\o  sa¬ 
ve  sa- 

IO  cn 
vo  .a- 
•  • 
o  o 


■X3  OT 

u>  &« 

to 

to 

4J  CU 

y  u 
o  u 
y  a 
04  o 
co 


•o  'o 

S3  s3 

ON  ON 

cn  h* 
*a-  ^ 
cn  cn 
•  • 
o  o 


'o  'o 

S3  S3 

rH  vO 
CN  00 

<r  o 


o  o 


'o  'o 

CO  oo 
vo  *a 
Is*  o 


O  o 


-h  m  cn 


i-t  cn  cn  sa¬ 


ws 


■H  cn  cn  sy 


-  24  - 


I 


NADC-79163-50 


(3)  KE  -  0.0000067  (PC)  +  0.024  (MN)  +  0.24 

(4)  KE  -  0.0000052  (PC)  +  0.026  (TB)  +  0.14 

where:  KE  Is  Known  Errors. 

PS  Is  Processed  Source. 

MN  Is  McCabe's  Number. 

PC  Is  Program  Clarity. 

TB  Is  Total  Branching  Instructions. 

Choice  of  an  equation  should  be  made  on  the  basis  of  the  difficulty  of 
measuring  a  particular  pair  of  parameters.  For  example,  McCabe's  Number  is 
easily  measured  for  structured  code.  When  the  code  is  not  structured.  Total 
Branching  Instructions  is  easier  to  measure.  As  shown  by  table  VII,  it  does 
not  matter  which  equation  is  selected. 

Table  VII  also  shows  that  there  is  little  gain  in  extending  the  functional 
relationships  to  many  parameters  or  to  more  complicated  functions.  There  is 
small  improvement  in  predicting  ability  between  the  two  parameter  linear  fit 
and  the  multiparameter  and  nonlinear  fits.  This  is  encouraging  since  the 
simpler  relationship  is  more  practical  both  in  terms  of  specifying  software 
contracts  and  in  terms  of  application  when  implementing  the  software. 

While  the  developed  functional  relationships  are  specific  only  for  this 
set  of  data,  the  results  indicate  that  static  analysis  can  produce  a  practical 
screening  criterion  applicable  at  a  very  early  point  in  the  software  develop¬ 
ment  process.  It  is  to  be  expected  that  the  coefficient  values  and  screening 
boundary  will  vary  with  such  development  environment  characteristics  as  pro¬ 
gramming  language,  personnel,  support  facility,  and  methodology.  However,  the 
results  of  this  study  indicate  that  a  simple  relationship  of  size,  structural, 
and  syntactic  parameters  can  be  developed  as  a  practical  screening  criterion. 

If  this  type  of  analysis  were  performed  consistently  over  many  software  devel¬ 
opments,  a  collection  of  relationships  corresponding  to  the  different  environ¬ 
ments  would  result.  The  software  purchaser  could  then  pick  the  criterion 
corresponding  to  his  development  environment.  For  the  present,  the  developed 
relationships  can  be  applied  to  those  environments  similar  to  that  of  refer¬ 
ence  (n). 


RECOMMENDATIONS 

It  is  recommended  that  the  results  of  this  study  be  verified  by  expanding 
the  analysis  to  other  software  development  environments.  These  environments 
should  reflect  present  DOD  policy  with  regard  to  HOL's  (high  order  languages) 
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and  modern  software  development  methodologies.  At  the  NAVAIRDEVCEN,  this  is 
embodied  in  FASP.  FASP  has  the  following  advantages  for  the  purposes  of  this 
analysis : 

1.  It  is  utilized  by  large  software  development  and  maintenance  projects. 
Large  developments  are  necessary  in  order  to  generate  sufficient  data 
for  analysis.  Large  developments  also  represent  the  most  valuable 
point  of  application  for  the  screening  criterion. 

2.  Tools  to  measure  the  size,  structure,  and  syntax  of  software  modules 

and  to  calculate  the  predicted  change  to  those  modules  may  be  implemented 
in  FASP  as  preprocessors  to  the  language  translators.  This  assures  that 
measurement  will  be  made  each  time  a  module  is  translated. 

3.  FASP  Software  Management  Reports  already  track  size  and  change  at  the 
module  level.  Expansion  of  this  tracking  to  include  structure  and  syntax, 
predicated  change,  and  to  differentiate  between  error  correction  and 
module  modification  is  feasible. 

It  is  therefore  recommended  that  measuring  tools  be  implemented  in  FASP  for 
the  following  military  computer  languages: 


COMPUTER 

HOL 

AL 

NAVAIRDEVCEN  PROJECTS 

AN/UYK-7 

CMS-2Y 

ULTRA-32 

CV-TSC ,  S-3A 

AN/UYK-20 

CMS-2M 

SPL/I(M) 

MACRO-20/ 14 

CV-TSC,  LAMPS 

AN/UYS-I 

SPL/I 

SPL 

ASP,  P-3C,  LAMPS 

It  is  also  recommended  that  the  FASP  Software  Management  Reports  be  modi¬ 
fied  to  allow  tracking  of  module  structure  and  syntax,  tracking  of  predicted 
change,  and  to  allow  user  specification  of  change  category  (error  correction 
or  module  modification). 

Following  implementation  of  these  changes,  it  is  recommended  that  a  three 
year  study  be  performed  for  those  projects  developing  and/or  maintaining  software 
in  the  above  languages  on  FASP.  It  is  expected  that  this  study  will  result  in  a 
family  of  screening  criteria  applicable  to  specific  languages  utilizing  the  FASP 
software  development  environment. 

It  is  further  recommended  that  those  projects  at  NAVAIRDEVCEN  utilizing  a 
software  development  environment  similar  to  the  one  of  the  present  study  be 
required  to: 

1.  Utilize  the  indicated  screening  criterion  as  an  element  of  their  soft¬ 
ware  module  acceptance  procedure. 

2.  Historically  track  both  the  measures  of  the  modules  and  their  changes. 
This  data  shall  be  used  to  verify  or  improve  the  screening  criterion. 


-  26  - 


t 


■jU 


NADC-79163-50 


REFERENCES 

(a)  Naval  Air  Development  Center  FASP  (Facility  for  Automated  Software 
Production)  Brochure,  Nov  1977. 

(b)  Dennis  W.  Farrell,  "Navy  Airborne  Weapon  System  Software  Acquisition," 
Defense  Systems  Management  Review,  Vol.  1,  No.  6. 

(c)  W. Myers,  "The  Need  for  Software  Engineering,"  Computer,  Feb  1978, 
pp.  12-26. 

(d)  Glenford  J.  Myers,  "Software  Reliability,  Principles  and  Practices," 

John  Wiley  and  Sons,  1976. 

(e)  G.  Schick  and  R.  Wolverton,  "An  Analysis  of  Competing  Software  Reliability 
Models,"  IEEE  Transactions  on  Software  Engineering,  Vol.  SE-4,  No.  2, 

Mar  1978,  pp.  104-120. 

(f)  B.W.  Boehm,  "Software  and  Its  Impact:  A  Quantitative  Assessment," 
Datamation,  May  1973. 

(g)  Honeywell,  Incorporated,  "Software  Management  Seminar"  Brochure,  Jan  1978. 

(h)  R.  Wolverton,  "The  Cost  of  Developing  Large-Scale  Software,"  IEEE  Trans¬ 
actions  on  Computers,  Vol.  C-23,  No.  6,  Jun  1974,  pp.  615-636. 

(i)  B.  W.  Boehm,  "Software  Requirements  and  Design  Aids,"  in  "Software 
Reliability:  Infotech  State  of  the  Art  Report;  2:  Invited  Papers," 
Infotech  International,  1977. 

(j)  T.J. McCabe,  "A  Complexity  Measure,"  IEEE  Transactions  on  Software  Engineer¬ 
ing,  Vol.  SE-2,  No.  4,  Dec  1976,  pp.  308-320. 

(k)  Maurice  H.  Halstead,  "Elements  of  Software  Science,"  Elsevier  North- 
Holland,  Inc.  1977. 

(l)  T.F. Green,  G.T. Howard,  R.J.Pariseau,  N.F.Schneidewind,  "Program  Structures, 
Complexity,  and  Error  Characteristics,"  presented  at  the  Symposium  on 
Computer  Software  Engineering,  Polytechnic  Institute  of  New  York, 

20-22  Apr  1976. 

(m)  G.T.  Howard,  M.Kirchgaessner ,  N.F.Schneidewind,  Report  No.  NPS52SS76111, 
"Software  Error  Detection:  Models,  Validation  Tests,  and  Program  Com¬ 
plexity  Measures,"  Naval  Postgraduate  School,  No.  1976. 

(n)  R.J.Pariseau,  Report  No.  NADC-76044-50,  "Improved  Software  Productivity 
for  Military  Computer  Systems  Through  Structured  Programming,"  Naval  Air 
Development  Center,  12  Mar  1976. 

(o)  "Control  Data  6000  Series  Computer  Systems  Statistical  Subroutines 
Reference  Manual,"  Publication  No.  60135300,  Control  Data  Corporation, 

June  1966. 


DISTRIBUTION  LIST 


REPORT  NO.  NADC-79163-50 

AIRTASK  NO.  ZF6I412001 
Work  Unit  No.  GC333 


NAVAIRSYSCOM,  AIR-950D 
(2  for  retention) 
(2  for  AIR-530) 

DDC . 


